Language ID for a Thousand Languages
Related resources
Rhyming Compounds as Elements of a Language Game (In Russian and English Languages)
The article is devoted to the study of composite rhyming compounds as a means of word-formation play. It explores the place of this category of words in the lexical system and the peculiarities of their use in Russian and English. The authors treat compound words as a special lexical subgroup. Using specific journalistic material, the article reveals the peculiarities of compo...
A Natural Language Generator for Minority Languages
The Bible Translator’s Assistant (TBTA) is a natural language generator (NLG) designed specifically for field linguists doing translation work in minority languages. In particular, TBTA is intended to generate drafts of the narrative portions of the Bible as well as numerous community development articles in a very wide range of languages. TBTA uses the rich interlingua approach. The semantic r...
Shedding (a Thousand Points of) Light on Biased Language
This paper considers the linguistic indicators of bias in political text. We used Amazon Mechanical Turk judgments about sentences from American political blogs, asking annotators to indicate whether a sentence showed bias, and if so, in which political direction and through which word tokens. We also asked annotators questions about their own political views. We conducted a preliminary analysi...
Pictures worth a thousand tiles, a geometrical programming language for self-assembly
We present a novel way to design self-assembling systems using a notion of signal (or ray) akin to what is used in analyzing the behavior of cellular automata. This allows purely geometrical constructions, with a smaller specification and easier analysis. We show how to design a system of signals for a given set of shapes, and how to transform these signals into a set of tiles which self-assemb...
Phonetic knowledge, phonotactics an automatic language id
This study explores a multilingual phonotactic approach to automatic language identification using Broadcast News data. The definition of a multilingual phoneset is discussed and an upper limit on the performance of the phonotactic approach is estimated by eliminating any degradation due to recognition errors. This upper bound is compared to automatic language identification based on a phonotac...
Journal
Journal title: LSA Annual Meeting Extended Abstracts
Year: 2010
ISSN: 2377-3367
DOI: 10.3765/exabs.v0i0.504